Overview

Dataset statistics

Number of variables: 33
Number of observations: 119390
Missing cells: 130601
Missing cells (%): 3.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 30.1 MiB
Average record size in memory: 264.0 B
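
The percentages above follow from the raw counts. The short sketch below recomputes them from the figures printed in this block; the variable names are illustrative, and the record size is only approximate because the total size is reported rounded to 0.1 MiB.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_vars = 33
n_obs = 119_390
missing_cells = 130_601
total_mib = 30.1  # rounded value, so derived numbers are approximate

# A "cell" is one variable in one observation.
total_cells = n_vars * n_obs
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_mib * 1024**2 / n_obs

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```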

Variable types

Numeric: 15
Categorical: 18

Alerts

country has a high cardinality: 177 distinct values High cardinality
reservation_status_date has a high cardinality: 926 distinct values High cardinality
arrival_date_year is highly correlated with arrival_date_week_number High correlation
arrival_date_week_number is highly correlated with arrival_date_year High correlation
is_repeated_guest is highly correlated with previous_bookings_not_canceled High correlation
previous_bookings_not_canceled is highly correlated with is_repeated_guest High correlation
market_segment is highly correlated with distribution_channel High correlation
reserved_room_type is highly correlated with assigned_room_type High correlation
reservation_status is highly correlated with is_canceled High correlation
is_canceled is highly correlated with reservation_status High correlation
assigned_room_type is highly correlated with reserved_room_type High correlation
distribution_channel is highly correlated with market_segment High correlation
Unnamed: 0 is highly correlated with hotel and 6 other fields High correlation
hotel is highly correlated with Unnamed: 0 and 3 other fields High correlation
is_canceled is highly correlated with Unnamed: 0 and 1 other field High correlation
lead_time is highly correlated with company High correlation
arrival_date_year is highly correlated with Unnamed: 0 and 2 other fields High correlation
arrival_date_month is highly correlated with Unnamed: 0 and 2 other fields High correlation
arrival_date_week_number is highly correlated with Unnamed: 0 and 3 other fields High correlation
stays_in_weekend_nights is highly correlated with stays_in_week_nights and 1 other field High correlation
stays_in_week_nights is highly correlated with stays_in_weekend_nights High correlation
children is highly correlated with reserved_room_type High correlation
market_segment is highly correlated with distribution_channel and 1 other field High correlation
distribution_channel is highly correlated with market_segment High correlation
previous_cancellations is highly correlated with previous_bookings_not_canceled High correlation
previous_bookings_not_canceled is highly correlated with previous_cancellations High correlation
reserved_room_type is highly correlated with children and 1 other field High correlation
assigned_room_type is highly correlated with hotel and 1 other field High correlation
booking_changes is highly correlated with stays_in_weekend_nights High correlation
deposit_type is highly correlated with reservation_status High correlation
agent is highly correlated with hotel and 1 other field High correlation
company is highly correlated with Unnamed: 0 and 5 other fields High correlation
reservation_status is highly correlated with Unnamed: 0 and 2 other fields High correlation
agent has 16340 (13.7%) missing values Missing
company has 112593 (94.3%) missing values Missing
previous_cancellations is highly skewed (γ1 = 24.45804872) Skewed
previous_bookings_not_canceled is highly skewed (γ1 = 23.53979995) Skewed
Unnamed: 0 is uniformly distributed Uniform
Unnamed: 0 has unique values Unique
lead_time has 6345 (5.3%) zeros Zeros
stays_in_weekend_nights has 51998 (43.6%) zeros Zeros
stays_in_week_nights has 7645 (6.4%) zeros Zeros
previous_cancellations has 112906 (94.6%) zeros Zeros
previous_bookings_not_canceled has 115770 (97.0%) zeros Zeros
booking_changes has 101314 (84.9%) zeros Zeros
days_in_waiting_list has 115692 (96.9%) zeros Zeros
adr has 1959 (1.6%) zeros Zeros
total_of_special_requests has 70318 (58.9%) zeros Zeros
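
The two Skewed alerts and the Zeros alerts flag the same columns: a count that is zero for almost every row and large for a few produces a very long right tail. The sketch below shows this with a skewness estimator written in plain Python. It assumes the report's γ1 is the adjusted Fisher-Pearson coefficient that pandas' `Series.skew` computes (the report does not say which estimator it uses), and `toy` is a made-up column, not data from this report.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (the estimator behind pandas' Series.skew)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5                            # biased skewness
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)   # small-sample correction

# Hypothetical zero-inflated column: 95% zeros plus a few larger counts.
toy = [0] * 95 + [1] * 4 + [20]
print(sample_skewness(toy))

# A symmetric column has (numerically) zero skewness.
print(sample_skewness([1, 2, 3, 4, 5]))
```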

Reproduction

Analysis started: 2022-07-25 13:16:14.054874
Analysis finished: 2022-07-25 13:16:51.590928
Duration: 37.54 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
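
A report like this one can be regenerated with a few lines of Python. The sketch below is a minimal example, assuming pandas and pandas-profiling are installed; the function name, the file paths, and the report title are placeholders, since the report does not record the input file or the options that were used.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write the result as an HTML report."""
    # Imported inside the function so that defining it does not require
    # the libraries to be installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Overview")  # title is a placeholder
    profile.to_file(html_path)
    return profile

# Hypothetical usage (file names are assumptions):
# build_report("bookings.csv", "report.html")
```

The `Unnamed: 0` variable in this report is the name pandas gives to a column with an empty header cell, which usually means a saved index was read back as data. If that is the case here, passing `index_col=0` to `pd.read_csv` would keep it out of the profile.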

Variables

Unnamed: 0
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 119390
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 59694.5
Minimum: 0
Maximum: 119389
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5969.45
Q1: 29847.25
Median: 59694.5
Q3: 89541.75
95-th percentile: 113419.55
Maximum: 119389
Range: 119389
Interquartile range (IQR): 59694.5

Descriptive statistics

Standard deviation: 34465.06866
Coefficient of variation (CV): 0.577357523
Kurtosis: -1.2
Mean: 59694.5
Median Absolute Deviation (MAD): 29847.5
Skewness: 0
Sum: 7126926355
Variance: 1187840958
Monotonicity: Strictly increasing
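
These statistics are exactly what a plain row counter 0, 1, ..., n - 1 produces, which supports reading `Unnamed: 0` as a leftover index rather than a feature. The sketch below checks the reported values against the closed-form expressions for such a sequence; it assumes the variance in the report is the sample variance (divisor n - 1).

```python
import math

n = 119_390  # number of observations; the column holds 0, 1, ..., n - 1

mean = (n - 1) / 2            # midpoint of the sequence
total = n * (n - 1) // 2      # arithmetic series sum
variance = n * (n + 1) / 12   # sample variance (divisor n - 1) of 0..n-1
std = math.sqrt(variance)
cv = std / mean               # coefficient of variation

print(mean, total, variance, std, cv)
```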
Histogram with fixed size bins (bins=50)
Value                    Count    Frequency (%)
0                        1        < 0.1%
79589                    1        < 0.1%
79601                    1        < 0.1%
79600                    1        < 0.1%
79599                    1        < 0.1%
79598                    1        < 0.1%
79597                    1        < 0.1%
79596                    1        < 0.1%
79595                    1        < 0.1%
79594                    1        < 0.1%
Other values (119380)    119380   > 99.9%

Value    Count    Frequency (%)
0        1        < 0.1%
1        1        < 0.1%
2        1        < 0.1%
3        1        < 0.1%
4        1        < 0.1%
5        1        < 0.1%
6        1        < 0.1%
7        1        < 0.1%
8        1        < 0.1%
9        1        < 0.1%

Value     Count    Frequency (%)
119389    1        < 0.1%
119388    1        < 0.1%
119387    1        < 0.1%
119386    1        < 0.1%
119385    1        < 0.1%
119384    1        < 0.1%
119383    1        < 0.1%
119382    1        < 0.1%
119381    1        < 0.1%
119380    1        < 0.1%

hotel
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
City Hotel: 79330
Resort Hotel: 40060

Length

Max length: 12
Median length: 10
Mean length: 10.67107798
Min length: 10

Characters and Unicode

Total characters: 1274020
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
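
As an illustration of those character properties, the sketch below recomputes the character totals for this variable from its two values and their counts, using the general category that Python's standard `unicodedata` module reports for each character (`Lu` uppercase letter, `Ll` lowercase letter, `Zs` space separator). The dictionary of counts is copied from this report; the script and block properties are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Value counts for `hotel`, copied from this report.
counts = {"City Hotel": 79_330, "Resort Hotel": 40_060}

chars = Counter()       # occurrences of each character
categories = Counter()  # occurrences per Unicode general category
for label, n in counts.items():
    for ch in label:
        chars[ch] += n
        categories[unicodedata.category(ch)] += n

total_chars = sum(chars.values())
print(total_chars, len(chars), dict(categories))
```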

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: City Hotel
2nd row: City Hotel
3rd row: City Hotel
4th row: City Hotel
5th row: City Hotel

Common Values

Value           Count    Frequency (%)
City Hotel      79330    66.4%
Resort Hotel    40060    33.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value     Count     Frequency (%)
hotel     119390    50.0%
city      79330     33.2%
resort    40060     16.8%

Most occurring characters

ValueCountFrequency (%)
t238780
18.7%
o159450
12.5%
e159450
12.5%
119390
9.4%
H119390
9.4%
l119390
9.4%
C79330
 
6.2%
i79330
 
6.2%
y79330
 
6.2%
R40060
 
3.1%
Other values (2)80120
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter915850
71.9%
Uppercase Letter238780
 
18.7%
Space Separator119390
 
9.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t238780
26.1%
o159450
17.4%
e159450
17.4%
l119390
13.0%
i79330
 
8.7%
y79330
 
8.7%
s40060
 
4.4%
r40060
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
H119390
50.0%
C79330
33.2%
R40060
 
16.8%
Space Separator
ValueCountFrequency (%)
119390
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1154630
90.6%
Common119390
 
9.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t238780
20.7%
o159450
13.8%
e159450
13.8%
H119390
10.3%
l119390
10.3%
C79330
 
6.9%
i79330
 
6.9%
y79330
 
6.9%
R40060
 
3.5%
s40060
 
3.5%
Common
ValueCountFrequency (%)
119390
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1274020
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t238780
18.7%
o159450
12.5%
e159450
12.5%
119390
9.4%
H119390
9.4%
l119390
9.4%
C79330
 
6.2%
i79330
 
6.2%
y79330
 
6.2%
R40060
 
3.1%
Other values (2)80120
 
6.3%

is_canceled
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
0: 75166
1: 44224

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 119390
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value    Count    Frequency (%)
0        75166    63.0%
1        44224    37.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
075166
63.0%
144224
37.0%

Most occurring characters

ValueCountFrequency (%)
075166
63.0%
144224
37.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number119390
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
075166
63.0%
144224
37.0%

Most occurring scripts

ValueCountFrequency (%)
Common119390
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
075166
63.0%
144224
37.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII119390
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
075166
63.0%
144224
37.0%

lead_time
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 479
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 104.0114164
Minimum: 0
Maximum: 737
Zeros: 6345
Zeros (%): 5.3%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 18
Median: 69
Q3: 160
95-th percentile: 320
Maximum: 737
Range: 737
Interquartile range (IQR): 142

Descriptive statistics

Standard deviation: 106.863097
Coefficient of variation (CV): 1.027416997
Kurtosis: 1.696448849
Mean: 104.0114164
Median Absolute Deviation (MAD): 60
Skewness: 1.346549873
Sum: 12417923
Variance: 11419.72151
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     6345     5.3%
1                     3460     2.9%
2                     2069     1.7%
3                     1816     1.5%
4                     1715     1.4%
5                     1565     1.3%
6                     1445     1.2%
7                     1331     1.1%
8                     1138     1.0%
12                    1079     0.9%
Other values (469)    97427    81.6%

Value    Count    Frequency (%)
0        6345     5.3%
1        3460     2.9%
2        2069     1.7%
3        1816     1.5%
4        1715     1.4%
5        1565     1.3%
6        1445     1.2%
7        1331     1.1%
8        1138     1.0%
9        992      0.8%

Value    Count    Frequency (%)
737      1        < 0.1%
709      1        < 0.1%
629      17       < 0.1%
626      30       < 0.1%
622      17       < 0.1%
615      17       < 0.1%
608      17       < 0.1%
605      30       < 0.1%
601      17       < 0.1%
594      17       < 0.1%

arrival_date_year
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
2016: 56707
2017: 40687
2015: 21996

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 477560
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2017
3rd row: 2017
4th row: 2017
5th row: 2017

Common Values

Value    Count    Frequency (%)
2016     56707    47.5%
2017     40687    34.1%
2015     21996    18.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
201656707
47.5%
201740687
34.1%
201521996
 
18.4%

Most occurring characters

ValueCountFrequency (%)
2119390
25.0%
0119390
25.0%
1119390
25.0%
656707
11.9%
740687
 
8.5%
521996
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number477560
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2119390
25.0%
0119390
25.0%
1119390
25.0%
656707
11.9%
740687
 
8.5%
521996
 
4.6%

Most occurring scripts

ValueCountFrequency (%)
Common477560
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2119390
25.0%
0119390
25.0%
1119390
25.0%
656707
11.9%
740687
 
8.5%
521996
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII477560
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2119390
25.0%
0119390
25.0%
1119390
25.0%
656707
11.9%
740687
 
8.5%
521996
 
4.6%

arrival_date_month
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
August: 13877
July: 12661
May: 11791
October: 11160
April: 11089
Other values (7): 58812

Length

Max length: 9
Median length: 7
Mean length: 5.903182846
Min length: 3

Characters and Unicode

Total characters: 704781
Distinct characters: 26
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: March
2nd row: March
3rd row: March
4th row: March
5th row: March

Common Values

Value               Count    Frequency (%)
August              13877    11.6%
July                12661    10.6%
May                 11791    9.9%
October             11160    9.3%
April               11089    9.3%
June                10939    9.2%
September           10508    8.8%
March               9794     8.2%
February            8068     6.8%
November            6794     5.7%
Other values (2)    12709    10.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
august13877
11.6%
july12661
10.6%
may11791
9.9%
october11160
9.3%
april11089
9.3%
june10939
9.2%
september10508
8.8%
march9794
8.2%
february8068
6.8%
november6794
5.7%
Other values (2)12709
10.6%

Most occurring characters

ValueCountFrequency (%)
e95619
13.6%
r78190
 
11.1%
u65351
 
9.3%
b43310
 
6.1%
a41511
 
5.9%
y38449
 
5.5%
t35545
 
5.0%
J29529
 
4.2%
c27734
 
3.9%
A24966
 
3.5%
Other values (16)224577
31.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter585391
83.1%
Uppercase Letter119390
 
16.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e95619
16.3%
r78190
13.4%
u65351
11.2%
b43310
 
7.4%
a41511
 
7.1%
y38449
 
6.6%
t35545
 
6.1%
c27734
 
4.7%
m24082
 
4.1%
l23750
 
4.1%
Other values (8)111850
19.1%
Uppercase Letter
ValueCountFrequency (%)
J29529
24.7%
A24966
20.9%
M21585
18.1%
O11160
 
9.3%
S10508
 
8.8%
F8068
 
6.8%
N6794
 
5.7%
D6780
 
5.7%

Most occurring scripts

ValueCountFrequency (%)
Latin704781
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e95619
13.6%
r78190
 
11.1%
u65351
 
9.3%
b43310
 
6.1%
a41511
 
5.9%
y38449
 
5.5%
t35545
 
5.0%
J29529
 
4.2%
c27734
 
3.9%
A24966
 
3.5%
Other values (16)224577
31.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII704781
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e95619
13.6%
r78190
 
11.1%
u65351
 
9.3%
b43310
 
6.1%
a41511
 
5.9%
y38449
 
5.5%
t35545
 
5.0%
J29529
 
4.2%
c27734
 
3.9%
A24966
 
3.5%
Other values (16)224577
31.9%

arrival_date_week_number
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.16517296
Minimum: 1
Maximum: 53
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 16
Median: 28
Q3: 38
95-th percentile: 49
Maximum: 53
Range: 52
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 13.60513836
Coefficient of variation (CV): 0.500830176
Kurtosis: -0.9860771763
Mean: 27.16517296
Median Absolute Deviation (MAD): 11
Skewness: -0.01001432604
Sum: 3243250
Variance: 185.0997897
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count    Frequency (%)
33                   3580     3.0%
30                   3087     2.6%
32                   3045     2.6%
34                   3040     2.5%
18                   2926     2.5%
21                   2854     2.4%
28                   2853     2.4%
17                   2805     2.3%
20                   2785     2.3%
29                   2763     2.3%
Other values (43)    89652    75.1%

Value    Count    Frequency (%)
1        1047     0.9%
2        1218     1.0%
3        1319     1.1%
4        1487     1.2%
5        1387     1.2%
6        1508     1.3%
7        2109     1.8%
8        2216     1.9%
9        2117     1.8%
10       2149     1.8%

Value    Count    Frequency (%)
53       1816     1.5%
52       1195     1.0%
51       933      0.8%
50       1505     1.3%
49       1782     1.5%
48       1504     1.3%
47       1685     1.4%
46       1574     1.3%
45       1941     1.6%
44       2272     1.9%

arrival_date_day_of_month
Real number (ℝ≥0)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.79824106
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
Median: 16
Q3: 23
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.780829471
Coefficient of variation (CV): 0.5558105765
Kurtosis: -1.187168319
Mean: 15.79824106
Median Absolute Deviation (MAD): 8
Skewness: -0.002000453979
Sum: 1886152
Variance: 77.10296619
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value                Count    Frequency (%)
17                   4406     3.7%
5                    4317     3.6%
15                   4196     3.5%
25                   4160     3.5%
26                   4147     3.5%
9                    4096     3.4%
12                   4087     3.4%
16                   4078     3.4%
2                    4055     3.4%
19                   4052     3.4%
Other values (21)    77796    65.2%

Value    Count    Frequency (%)
1        3626     3.0%
2        4055     3.4%
3        3855     3.2%
4        3763     3.2%
5        4317     3.6%
6        3833     3.2%
7        3665     3.1%
8        3921     3.3%
9        4096     3.4%
10       3575     3.0%

Value    Count    Frequency (%)
31       2208     1.8%
30       3853     3.2%
29       3580     3.0%
28       3946     3.3%
27       3802     3.2%
26       4147     3.5%
25       4160     3.5%
24       3993     3.3%
23       3616     3.0%
22       3596     3.0%

stays_in_weekend_nights
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9275986264
Minimum: 0
Maximum: 19
Zeros: 51998
Zeros (%): 43.6%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2
95-th percentile: 2
Maximum: 19
Range: 19
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9986134946
Coefficient of variation (CV): 1.076557755
Kurtosis: 7.174066064
Mean: 0.9275986264
Median Absolute Deviation (MAD): 1
Skewness: 1.38004645
Sum: 110746
Variance: 0.9972289116
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Value               Count    Frequency (%)
0                   51998    43.6%
2                   33308    27.9%
1                   30626    25.7%
4                   1855     1.6%
3                   1259     1.1%
6                   153      0.1%
5                   79       0.1%
8                   60       0.1%
7                   19       < 0.1%
9                   11       < 0.1%
Other values (7)    22       < 0.1%

Value    Count    Frequency (%)
0        51998    43.6%
1        30626    25.7%
2        33308    27.9%
3        1259     1.1%
4        1855     1.6%
5        79       0.1%
6        153      0.1%
7        19       < 0.1%
8        60       0.1%
9        11       < 0.1%

Value    Count    Frequency (%)
19       1        < 0.1%
18       1        < 0.1%
16       3        < 0.1%
14       2        < 0.1%
13       3        < 0.1%
12       5        < 0.1%
10       7        < 0.1%
9        11       < 0.1%
8        60       0.1%
7        19       < 0.1%

stays_in_week_nights
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 35
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.500301533
Minimum: 0
Maximum: 50
Zeros: 7645
Zeros (%): 6.4%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 2
Q3: 3
95-th percentile: 5
Maximum: 50
Range: 50
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.908285615
Coefficient of variation (CV): 0.7632221914
Kurtosis: 24.28455482
Mean: 2.500301533
Median Absolute Deviation (MAD): 1
Skewness: 2.862249242
Sum: 298511
Variance: 3.641553989
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=35)
Value                Count    Frequency (%)
2                    33684    28.2%
1                    30310    25.4%
3                    22258    18.6%
5                    11077    9.3%
4                    9563     8.0%
0                    7645     6.4%
6                    1499     1.3%
10                   1036     0.9%
7                    1029     0.9%
8                    656      0.5%
Other values (25)    633      0.5%

Value    Count    Frequency (%)
0        7645     6.4%
1        30310    25.4%
2        33684    28.2%
3        22258    18.6%
4        9563     8.0%
5        11077    9.3%
6        1499     1.3%
7        1029     0.9%
8        656      0.5%
9        231      0.2%

Value    Count    Frequency (%)
50       1        < 0.1%
42       1        < 0.1%
41       1        < 0.1%
40       2        < 0.1%
35       1        < 0.1%
34       1        < 0.1%
33       1        < 0.1%
32       1        < 0.1%
30       5        < 0.1%
26       1        < 0.1%

adults
Real number (ℝ≥0)

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.856403384
Minimum: 0
Maximum: 55
Zeros: 403
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 2
Q3: 2
95-th percentile: 3
Maximum: 55
Range: 55
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5792609988
Coefficient of variation (CV): 0.3120340137
Kurtosis: 1352.115116
Mean: 1.856403384
Median Absolute Deviation (MAD): 0
Skewness: 18.31780476
Sum: 221636
Variance: 0.3355433048
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Value               Count    Frequency (%)
2                   89680    75.1%
1                   23027    19.3%
3                   6202     5.2%
0                   403      0.3%
4                   62       0.1%
26                  5        < 0.1%
27                  2        < 0.1%
20                  2        < 0.1%
5                   2        < 0.1%
40                  1        < 0.1%
Other values (4)    4        < 0.1%

Value    Count    Frequency (%)
0        403      0.3%
1        23027    19.3%
2        89680    75.1%
3        6202     5.2%
4        62       0.1%
5        2        < 0.1%
6        1        < 0.1%
10       1        < 0.1%
20       2        < 0.1%
26       5        < 0.1%

Value    Count    Frequency (%)
55       1        < 0.1%
50       1        < 0.1%
40       1        < 0.1%
27       2        < 0.1%
26       5        < 0.1%
20       2        < 0.1%
10       1        < 0.1%
6        1        < 0.1%
5        2        < 0.1%
4        62       0.1%

children
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 4
Missing (%): < 0.1%
Memory size: 932.9 KiB
0.0: 110796
1.0: 4861
2.0: 3652
3.0: 76
10.0: 1

Length

Max length: 4
Median length: 3
Mean length: 3.000008376
Min length: 3

Characters and Unicode

Total characters: 358159
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value        Count     Frequency (%)
0.0          110796    92.8%
1.0          4861      4.1%
2.0          3652      3.1%
3.0          76        0.1%
10.0         1         < 0.1%
(Missing)    4         < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0.0110796
92.8%
1.04861
 
4.1%
2.03652
 
3.1%
3.076
 
0.1%
10.01
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0230183
64.3%
.119386
33.3%
14862
 
1.4%
23652
 
1.0%
376
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number238773
66.7%
Other Punctuation119386
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0230183
96.4%
14862
 
2.0%
23652
 
1.5%
376
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
.119386
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common358159
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0230183
64.3%
.119386
33.3%
14862
 
1.4%
23652
 
1.0%
376
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII358159
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0230183
64.3%
.119386
33.3%
14862
 
1.4%
23652
 
1.0%
376
 
< 0.1%

babies
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
0: 118473
1: 900
2: 15
9: 1
10: 1

Length

Max length: 2
Median length: 1
Mean length: 1.000008376
Min length: 1

Characters and Unicode

Total characters: 119391
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        118473    99.2%
1        900       0.8%
2        15        < 0.1%
9        1         < 0.1%
10       1         < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
0118473
99.2%
1900
 
0.8%
215
 
< 0.1%
91
 
< 0.1%
101
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0118474
99.2%
1901
 
0.8%
215
 
< 0.1%
91
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number119391
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0118474
99.2%
1901
 
0.8%
215
 
< 0.1%
91
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common119391
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0118474
99.2%
1901
 
0.8%
215
 
< 0.1%
91
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII119391
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0118474
99.2%
1901
 
0.8%
215
 
< 0.1%
91
 
< 0.1%

meal
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 1169
Missing (%): 1.0%
Memory size: 932.9 KiB
BB: 92310
HB: 14463
SC: 10650
FB: 798

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 236442
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: BB
2nd row: BB
3rd row: BB
4th row: BB
5th row: BB

Common Values

Value        Count    Frequency (%)
BB           92310    77.3%
HB           14463    12.1%
SC           10650    8.9%
FB           798      0.7%
(Missing)    1169     1.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
bb92310
78.1%
hb14463
 
12.2%
sc10650
 
9.0%
fb798
 
0.7%

Most occurring characters

ValueCountFrequency (%)
B199881
84.5%
H14463
 
6.1%
S10650
 
4.5%
C10650
 
4.5%
F798
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter236442
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B199881
84.5%
H14463
 
6.1%
S10650
 
4.5%
C10650
 
4.5%
F798
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin236442
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B199881
84.5%
H14463
 
6.1%
S10650
 
4.5%
C10650
 
4.5%
F798
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII236442
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B199881
84.5%
H14463
 
6.1%
S10650
 
4.5%
C10650
 
4.5%
F798
 
0.3%

country
Categorical

HIGH CARDINALITY

Distinct: 177
Distinct (%): 0.1%
Missing: 488
Missing (%): 0.4%
Memory size: 932.9 KiB
PRT: 48590
GBR: 12129
FRA: 10415
ESP: 8568
DEU: 7287
Other values (172): 31913

Length

Max length: 3
Median length: 3
Mean length: 2.989243242
Min length: 2

Characters and Unicode

Total characters: 355427
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): < 0.1%

Sample

1st row: PRT
2nd row: PRT
3rd row: PRT
4th row: PRT
5th row: PRT

Common Values

Value                 Count    Frequency (%)
PRT                   48590    40.7%
GBR                   12129    10.2%
FRA                   10415    8.7%
ESP                   8568     7.2%
DEU                   7287     6.1%
ITA                   3766     3.2%
IRL                   3375     2.8%
BEL                   2342     2.0%
BRA                   2224     1.9%
NLD                   2104     1.8%
Other values (167)    18102    15.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
prt48590
40.9%
gbr12129
 
10.2%
fra10415
 
8.8%
esp8568
 
7.2%
deu7287
 
6.1%
ita3766
 
3.2%
irl3375
 
2.8%
bel2342
 
2.0%
bra2224
 
1.9%
nld2104
 
1.8%
Other values (167)18102
 
15.2%

Most occurring characters

ValueCountFrequency (%)
R80804
22.7%
P58506
16.5%
T54263
15.3%
A21627
 
6.1%
E21538
 
6.1%
B17051
 
4.8%
S13931
 
3.9%
U13293
 
3.7%
G13130
 
3.7%
F10956
 
3.1%
Other values (16)50328
14.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter355427
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R80804
22.7%
P58506
16.5%
T54263
15.3%
A21627
 
6.1%
E21538
 
6.1%
B17051
 
4.8%
S13931
 
3.9%
U13293
 
3.7%
G13130
 
3.7%
F10956
 
3.1%
Other values (16)50328
14.2%

Most occurring scripts

ValueCountFrequency (%)
Latin355427
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R80804
22.7%
P58506
16.5%
T54263
15.3%
A21627
 
6.1%
E21538
 
6.1%
B17051
 
4.8%
S13931
 
3.9%
U13293
 
3.7%
G13130
 
3.7%
F10956
 
3.1%
Other values (16)50328
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII355427
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R80804
22.7%
P58506
16.5%
T54263
15.3%
A21627
 
6.1%
E21538
 
6.1%
B17051
 
4.8%
S13931
 
3.9%
U13293
 
3.7%
G13130
 
3.7%
F10956
 
3.1%
Other values (16)50328
14.2%

market_segment
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 932.9 KiB
Online TA: 56477
Offline TA/TO: 24219
Groups: 19811
Direct: 12606
Corporate: 5295
Other values (2): 980

Length

Max length: 13
Median length: 9
Mean length: 9.019767481
Min length: 6

Characters and Unicode

Total characters: 1076852
Distinct characters: 24
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Groups
2nd row: Groups
3rd row: Groups
4th row: Groups
5th row: Groups

Common Values

Value            Count    Frequency (%)
Online TA        56477    47.3%
Offline TA/TO    24219    20.3%
Groups           19811    16.6%
Direct           12606    10.6%
Corporate        5295     4.4%
Complementary    743      0.6%
Aviation         237      0.2%
(Missing)        2        < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
online56477
28.2%
ta56477
28.2%
offline24219
12.1%
ta/to24219
12.1%
groups19811
 
9.9%
direct12606
 
6.3%
corporate5295
 
2.6%
complementary743
 
0.4%
aviation237
 
0.1%

Most occurring characters

ValueCountFrequency (%)
n138153
12.8%
O104915
9.7%
T104915
9.7%
e100083
9.3%
i93776
8.7%
l81439
7.6%
A80933
7.5%
80696
7.5%
f48438
 
4.5%
r43750
 
4.1%
Other values (14)199754
18.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter642719
59.7%
Uppercase Letter329218
30.6%
Space Separator80696
 
7.5%
Other Punctuation24219
 
2.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n138153
21.5%
e100083
15.6%
i93776
14.6%
l81439
12.7%
f48438
 
7.5%
r43750
 
6.8%
o31381
 
4.9%
p25849
 
4.0%
u19811
 
3.1%
s19811
 
3.1%
Other values (6)40228
 
6.3%
Uppercase Letter
ValueCountFrequency (%)
O104915
31.9%
T104915
31.9%
A80933
24.6%
G19811
 
6.0%
D12606
 
3.8%
C6038
 
1.8%
Space Separator
ValueCountFrequency (%)
80696
100.0%
Other Punctuation
ValueCountFrequency (%)
/24219
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin971937
90.3%
Common104915
 
9.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
n138153
14.2%
O104915
10.8%
T104915
10.8%
e100083
10.3%
i93776
9.6%
l81439
8.4%
A80933
8.3%
f48438
 
5.0%
r43750
 
4.5%
o31381
 
3.2%
Other values (12)144154
14.8%
Common
ValueCountFrequency (%)
80696
76.9%
/24219
 
23.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII1076852
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n138153
12.8%
O104915
9.7%
T104915
9.7%
e100083
9.3%
i93776
8.7%
l81439
7.6%
A80933
7.5%
80696
7.5%
f48438
 
4.5%
r43750
 
4.1%
Other values (14)199754
18.5%

distribution_channel
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 5
Missing (%): < 0.1%
Memory size: 932.9 KiB

TA/TO: 97870
Direct: 14645
Corporate: 6677
GDS: 193

Length

Max length: 9
Median length: 5
Mean length: 5.343150312
Min length: 3

Characters and Unicode

Total characters: 637892
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
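
As a rough illustration of how such character properties can be derived, the sketch below uses Python's standard `unicodedata` module to tally Unicode general categories for the four `distribution_channel` values, weighted by the counts reported for this variable. The helper name `category_totals` is hypothetical, and this is not necessarily how the profiling tool computes its own figures.

```python
import unicodedata
from collections import Counter

# Value counts for distribution_channel as reported in this section
# (the 5 missing rows are excluded).
value_counts = {"TA/TO": 97870, "Direct": 14645, "Corporate": 6677, "GDS": 193}


def category_totals(counts):
    """Tally Unicode general categories over every character occurrence."""
    totals = Counter()
    for value, n in counts.items():
        for ch in value:
            # unicodedata.category returns a two-letter code such as
            # 'Lu' (uppercase letter), 'Ll' (lowercase letter) or
            # 'Po' (other punctuation).
            totals[unicodedata.category(ch)] += n
    return totals


totals = category_totals(value_counts)
total_chars = sum(totals.values())
# Mean length is taken over the non-missing rows only.
mean_length = total_chars / sum(value_counts.values())

print(dict(totals))
print(total_chars, round(mean_length, 4))
```

With these inputs the tallies agree with the figures reported for this variable: 413381 uppercase letters, 126641 lowercase letters, 97870 punctuation characters, and 637892 characters in total.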

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TA/TO
2nd row: TA/TO
3rd row: TA/TO
4th row: TA/TO
5th row: TA/TO

Common Values

Value      Count  Frequency (%)
TA/TO      97870  82.0%
Direct     14645  12.3%
Corporate  6677   5.6%
GDS        193    0.2%
(Missing)  5      < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value      Count  Frequency (%)
ta/to      97870  82.0%
direct     14645  12.3%
corporate  6677   5.6%
gds        193    0.2%

Most occurring characters

ValueCountFrequency (%)
T195740
30.7%
A97870
15.3%
/97870
15.3%
O97870
15.3%
r27999
 
4.4%
e21322
 
3.3%
t21322
 
3.3%
D14838
 
2.3%
i14645
 
2.3%
c14645
 
2.3%
Other values (6)33771
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter413381
64.8%
Lowercase Letter126641
 
19.9%
Other Punctuation97870
 
15.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r27999
22.1%
e21322
16.8%
t21322
16.8%
i14645
11.6%
c14645
11.6%
o13354
10.5%
p6677
 
5.3%
a6677
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
T195740
47.4%
A97870
23.7%
O97870
23.7%
D14838
 
3.6%
C6677
 
1.6%
G193
 
< 0.1%
S193
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
/97870
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin540022
84.7%
Common97870
 
15.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
T195740
36.2%
A97870
18.1%
O97870
18.1%
r27999
 
5.2%
e21322
 
3.9%
t21322
 
3.9%
D14838
 
2.7%
i14645
 
2.7%
c14645
 
2.7%
o13354
 
2.5%
Other values (5)20417
 
3.8%
Common
ValueCountFrequency (%)
/97870
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII637892
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T195740
30.7%
A97870
15.3%
/97870
15.3%
O97870
15.3%
r27999
 
4.4%
e21322
 
3.3%
t21322
 
3.3%
D14838
 
2.3%
i14645
 
2.3%
c14645
 
2.3%
Other values (6)33771
 
5.3%

is_repeated_guest
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

0: 115580
1: 3810

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 119390
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      115580  96.8%
1      3810    3.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count   Frequency (%)
0      115580  96.8%
1      3810    3.2%

Most occurring characters

ValueCountFrequency (%)
0115580
96.8%
13810
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number119390
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0115580
96.8%
13810
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Common119390
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0115580
96.8%
13810
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII119390
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0115580
96.8%
13810
 
3.2%

previous_cancellations
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08711784907
Minimum: 0
Maximum: 26
Zeros: 112906
Zeros (%): 94.6%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 26
Range: 26
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.8443363842
Coefficient of variation (CV): 9.691887405
Kurtosis: 674.0736926
Mean: 0.08711784907
Median Absolute Deviation (MAD): 0
Skewness: 24.45804872
Sum: 10401
Variance: 0.7129039296
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=15)

Most frequent values
Value             Count   Frequency (%)
0                 112906  94.6%
1                 6051    5.1%
2                 116     0.1%
3                 65      0.1%
24                48      < 0.1%
11                35      < 0.1%
4                 31      < 0.1%
26                26      < 0.1%
25                25      < 0.1%
6                 22      < 0.1%
Other values (5)  65      0.1%

Smallest values
Value  Count   Frequency (%)
0      112906  94.6%
1      6051    5.1%
2      116     0.1%
3      65      0.1%
4      31      < 0.1%
5      19      < 0.1%
6      22      < 0.1%
11     35      < 0.1%
13     12      < 0.1%
14     14      < 0.1%

Largest values
Value  Count  Frequency (%)
26     26     < 0.1%
25     25     < 0.1%
24     48     < 0.1%
21     1      < 0.1%
19     19     < 0.1%
14     14     < 0.1%
13     12     < 0.1%
11     35     < 0.1%
6      22     < 0.1%
5      19     < 0.1%
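
Because previous_cancellations takes only 15 distinct values, its summary statistics can be rebuilt from the value counts listed above. The sketch below does this in plain Python as a cross-check; it assumes the report uses the sample variance (divisor n - 1, the pandas default), which is not stated in the report itself.

```python
import math

# Distinct values of previous_cancellations and their counts, read from
# the value tables in this section (value -> number of rows).
freq = {0: 112906, 1: 6051, 2: 116, 3: 65, 4: 31, 5: 19, 6: 22, 11: 35,
        13: 12, 14: 14, 19: 19, 21: 1, 24: 48, 25: 25, 26: 26}

n = sum(freq.values())                        # number of observations
total = sum(v * c for v, c in freq.items())   # the report's "Sum"
mean = total / n

# Sample variance (divisor n - 1). Assumption: the report follows the
# same convention.
sum_sq_dev = sum(c * (v - mean) ** 2 for v, c in freq.items())
variance = sum_sq_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean                               # coefficient of variation

print(n, total, mean, variance, std, cv)
```

Under that assumption the recomputed mean, variance and coefficient of variation agree with the reported values to the precision shown.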

previous_bookings_not_canceled
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 73
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1370969093
Minimum: 0
Maximum: 72
Zeros: 115770
Zeros (%): 97.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 72
Range: 72
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.497436848
Coefficient of variation (CV): 10.92246977
Kurtosis: 767.2452097
Mean: 0.1370969093
Median Absolute Deviation (MAD): 0
Skewness: 23.53979995
Sum: 16368
Variance: 2.242317113
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value              Count   Frequency (%)
0                  115770  97.0%
1                  1542    1.3%
2                  580     0.5%
3                  333     0.3%
4                  229     0.2%
5                  181     0.2%
6                  115     0.1%
7                  88      0.1%
8                  70      0.1%
9                  60      0.1%
Other values (63)  422     0.4%

Smallest values
Value  Count   Frequency (%)
0      115770  97.0%
1      1542    1.3%
2      580     0.5%
3      333     0.3%
4      229     0.2%
5      181     0.2%
6      115     0.1%
7      88      0.1%
8      70      0.1%
9      60      0.1%

Largest values
Value  Count  Frequency (%)
72     1      < 0.1%
71     1      < 0.1%
70     1      < 0.1%
69     1      < 0.1%
68     1      < 0.1%
67     1      < 0.1%
66     1      < 0.1%
65     1      < 0.1%
64     1      < 0.1%
63     1      < 0.1%

reserved_room_type
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

A: 85994
D: 19201
E: 6535
F: 2897
G: 2094
Other values (5): 2669

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 119390
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value  Count  Frequency (%)
A      85994  72.0%
D      19201  16.1%
E      6535   5.5%
F      2897   2.4%
G      2094   1.8%
B      1118   0.9%
C      932    0.8%
H      601    0.5%
P      12     < 0.1%
L      6      < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
a      85994  72.0%
d      19201  16.1%
e      6535   5.5%
f      2897   2.4%
g      2094   1.8%
b      1118   0.9%
c      932    0.8%
h      601    0.5%
p      12     < 0.1%
l      6      < 0.1%

Most occurring characters

ValueCountFrequency (%)
A85994
72.0%
D19201
 
16.1%
E6535
 
5.5%
F2897
 
2.4%
G2094
 
1.8%
B1118
 
0.9%
C932
 
0.8%
H601
 
0.5%
P12
 
< 0.1%
L6
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter119390
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A85994
72.0%
D19201
 
16.1%
E6535
 
5.5%
F2897
 
2.4%
G2094
 
1.8%
B1118
 
0.9%
C932
 
0.8%
H601
 
0.5%
P12
 
< 0.1%
L6
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin119390
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A85994
72.0%
D19201
 
16.1%
E6535
 
5.5%
F2897
 
2.4%
G2094
 
1.8%
B1118
 
0.9%
C932
 
0.8%
H601
 
0.5%
P12
 
< 0.1%
L6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII119390
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A85994
72.0%
D19201
 
16.1%
E6535
 
5.5%
F2897
 
2.4%
G2094
 
1.8%
B1118
 
0.9%
C932
 
0.8%
H601
 
0.5%
P12
 
< 0.1%
L6
 
< 0.1%

assigned_room_type
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

A: 74053
D: 25322
E: 7806
F: 3751
G: 2553
Other values (7): 5905

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 119390
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value             Count  Frequency (%)
A                 74053  62.0%
D                 25322  21.2%
E                 7806   6.5%
F                 3751   3.1%
G                 2553   2.1%
C                 2375   2.0%
B                 2163   1.8%
H                 712    0.6%
I                 363    0.3%
K                 279    0.2%
Other values (2)  13     < 0.1%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
a                 74053  62.0%
d                 25322  21.2%
e                 7806   6.5%
f                 3751   3.1%
g                 2553   2.1%
c                 2375   2.0%
b                 2163   1.8%
h                 712    0.6%
i                 363    0.3%
k                 279    0.2%
Other values (2)  13     < 0.1%

Most occurring characters

ValueCountFrequency (%)
A74053
62.0%
D25322
 
21.2%
E7806
 
6.5%
F3751
 
3.1%
G2553
 
2.1%
C2375
 
2.0%
B2163
 
1.8%
H712
 
0.6%
I363
 
0.3%
K279
 
0.2%
Other values (2)13
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter119390
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A74053
62.0%
D25322
 
21.2%
E7806
 
6.5%
F3751
 
3.1%
G2553
 
2.1%
C2375
 
2.0%
B2163
 
1.8%
H712
 
0.6%
I363
 
0.3%
K279
 
0.2%
Other values (2)13
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin119390
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A74053
62.0%
D25322
 
21.2%
E7806
 
6.5%
F3751
 
3.1%
G2553
 
2.1%
C2375
 
2.0%
B2163
 
1.8%
H712
 
0.6%
I363
 
0.3%
K279
 
0.2%
Other values (2)13
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII119390
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A74053
62.0%
D25322
 
21.2%
E7806
 
6.5%
F3751
 
3.1%
G2553
 
2.1%
C2375
 
2.0%
B2163
 
1.8%
H712
 
0.6%
I363
 
0.3%
K279
 
0.2%
Other values (2)13
 
< 0.1%

booking_changes
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2211240472
Minimum: 0
Maximum: 21
Zeros: 101314
Zeros (%): 84.9%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 21
Range: 21
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6523055727
Coefficient of variation (CV): 2.949953118
Kurtosis: 79.39360467
Mean: 0.2211240472
Median Absolute Deviation (MAD): 0
Skewness: 6.000270054
Sum: 26400
Variance: 0.4255025601
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=21)

Most frequent values
Value              Count   Frequency (%)
0                  101314  84.9%
1                  12701   10.6%
2                  3805    3.2%
3                  927     0.8%
4                  376     0.3%
5                  118     0.1%
6                  63      0.1%
7                  31      < 0.1%
8                  17      < 0.1%
9                  8       < 0.1%
Other values (11)  30      < 0.1%

Smallest values
Value  Count   Frequency (%)
0      101314  84.9%
1      12701   10.6%
2      3805    3.2%
3      927     0.8%
4      376     0.3%
5      118     0.1%
6      63      0.1%
7      31      < 0.1%
8      17      < 0.1%
9      8       < 0.1%

Largest values
Value  Count  Frequency (%)
21     1      < 0.1%
20     1      < 0.1%
18     1      < 0.1%
17     2      < 0.1%
16     2      < 0.1%
15     3      < 0.1%
14     5      < 0.1%
13     5      < 0.1%
12     2      < 0.1%
11     2      < 0.1%

deposit_type
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

No Deposit: 104641
Non Refund: 14587
Refundable: 162

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1193900
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Non Refund
2nd row: Non Refund
3rd row: Non Refund
4th row: Non Refund
5th row: Non Refund

Common Values

Value       Count   Frequency (%)
No Deposit  104641  87.6%
Non Refund  14587   12.2%
Refundable  162     0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value       Count   Frequency (%)
no          104641  43.9%
deposit     104641  43.9%
non         14587   6.1%
refund      14587   6.1%
refundable  162     0.1%
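
The word-level counts above can be reproduced from the Common Values of deposit_type. The sketch below assumes that words are obtained by lowercasing each value and splitting on whitespace; that assumption matches this table, but it may not describe the profiling tool's tokeniser in general.

```python
from collections import Counter

# Value counts for deposit_type as reported in this section.
value_counts = {"No Deposit": 104641, "Non Refund": 14587, "Refundable": 162}

# Assumption: lowercase each value and split on whitespace.
word_counts = Counter()
for value, n in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += n

total_words = sum(word_counts.values())
for word, n in word_counts.most_common():
    print(f"{word:<12}{n:>8}{100 * n / total_words:>7.1f}%")
```

Each percentage is taken relative to the total number of words (238618 here), not the number of rows, which is why "no" appears at 43.9% although "No Deposit" covers 87.6% of rows.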

Most occurring characters

ValueCountFrequency (%)
o223869
18.8%
e119552
10.0%
N119228
10.0%
119228
10.0%
s104641
8.8%
i104641
8.8%
t104641
8.8%
p104641
8.8%
D104641
8.8%
n29336
 
2.5%
Other values (7)59482
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter836054
70.0%
Uppercase Letter238618
 
20.0%
Space Separator119228
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o223869
26.8%
e119552
14.3%
s104641
12.5%
i104641
12.5%
t104641
12.5%
p104641
12.5%
n29336
 
3.5%
f14749
 
1.8%
u14749
 
1.8%
d14749
 
1.8%
Other values (3)486
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
N119228
50.0%
D104641
43.9%
R14749
 
6.2%
Space Separator
ValueCountFrequency (%)
119228
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1074672
90.0%
Common119228
 
10.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o223869
20.8%
e119552
11.1%
N119228
11.1%
s104641
9.7%
i104641
9.7%
t104641
9.7%
p104641
9.7%
D104641
9.7%
n29336
 
2.7%
R14749
 
1.4%
Other values (6)44733
 
4.2%
Common
ValueCountFrequency (%)
119228
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1193900
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o223869
18.8%
e119552
10.0%
N119228
10.0%
119228
10.0%
s104641
8.8%
i104641
8.8%
t104641
8.8%
p104641
8.8%
D104641
8.8%
n29336
 
2.5%
Other values (7)59482
 
5.0%

agent
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 333
Distinct (%): 0.3%
Missing: 16340
Missing (%): 13.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 86.69338185
Minimum: 1
Maximum: 535
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 9
median: 14
Q3: 229
95-th percentile: 250
Maximum: 535
Range: 534
Interquartile range (IQR): 220

Descriptive statistics

Standard deviation: 110.7745476
Coefficient of variation (CV): 1.277773981
Kurtosis: -0.007179564938
Mean: 86.69338185
Median Absolute Deviation (MAD): 13
Skewness: 1.089385636
Sum: 8933753
Variance: 12271.00041
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
9                   31961  26.8%
240                 13922  11.7%
1                   7191   6.0%
14                  3640   3.0%
7                   3539   3.0%
6                   3290   2.8%
250                 2870   2.4%
241                 1721   1.4%
28                  1666   1.4%
8                   1514   1.3%
Other values (323)  31736  26.6%
(Missing)           16340  13.7%

Smallest values
Value  Count  Frequency (%)
1      7191   6.0%
2      162    0.1%
3      1336   1.1%
4      47     < 0.1%
5      330    0.3%
6      3290   2.8%
7      3539   3.0%
8      1514   1.3%
9      31961  26.8%
10     260    0.2%

Largest values
Value  Count  Frequency (%)
535    3      < 0.1%
531    68     0.1%
527    35     < 0.1%
526    10     < 0.1%
510    2      < 0.1%
509    10     < 0.1%
508    6      < 0.1%
502    24     < 0.1%
497    1      < 0.1%
495    57     < 0.1%

company
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 352
Distinct (%): 5.2%
Missing: 112593
Missing (%): 94.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 189.2667353
Minimum: 6
Maximum: 543
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB
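
With 94.3% of company values missing, it helps to be explicit about which denominator each percentage uses. The sketch below recomputes the figures from the counts reported for this variable; the assumption that Distinct (%) and the mean are taken over non-missing rows is an inference from the numbers, not something the report states.

```python
# Figures reported for the company variable in this section.
n_rows = 119390       # total observations in the dataset
n_missing = 112593    # rows where company is missing
n_distinct = 352      # distinct non-missing company IDs
value_sum = 1286446   # "Sum" from the descriptive statistics

n_present = n_rows - n_missing

# Missing (%) is relative to all rows.
missing_pct = 100 * n_missing / n_rows
# Assumption: Distinct (%) and the mean use non-missing rows only.
distinct_pct = 100 * n_distinct / n_present
mean = value_sum / n_present

print(f"{missing_pct:.1f}% missing, {distinct_pct:.1f}% distinct, mean {mean:.4f}")
```

Dividing 352 by all 119390 rows would give about 0.3% rather than the reported 5.2%, so the non-missing denominator (6797 rows) is the one consistent with the report.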

Quantile statistics

Minimum: 6
5-th percentile: 40
Q1: 62
median: 179
Q3: 270
95-th percentile: 435
Maximum: 543
Range: 537
Interquartile range (IQR): 208

Descriptive statistics

Standard deviation: 131.6550146
Coefficient of variation (CV): 0.6956056721
Kurtosis: -0.4907952103
Mean: 189.2667353
Median Absolute Deviation (MAD): 111
Skewness: 0.6015996673
Sum: 1286446
Variance: 17333.04288
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count   Frequency (%)
40                  927     0.8%
223                 784     0.7%
67                  267     0.2%
45                  250     0.2%
153                 215     0.2%
174                 149     0.1%
219                 141     0.1%
281                 138     0.1%
154                 133     0.1%
405                 119     0.1%
Other values (342)  3674    3.1%
(Missing)           112593  94.3%

Smallest values
Value  Count  Frequency (%)
6      1      < 0.1%
8      1      < 0.1%
9      37     < 0.1%
10     1      < 0.1%
11     1      < 0.1%
12     14     < 0.1%
14     9      < 0.1%
16     5      < 0.1%
18     1      < 0.1%
20     50     < 0.1%

Largest values
Value  Count  Frequency (%)
543    2      < 0.1%
541    1      < 0.1%
539    2      < 0.1%
534    2      < 0.1%
531    1      < 0.1%
530    5      < 0.1%
528    2      < 0.1%
525    15     < 0.1%
523    19     < 0.1%
521    7      < 0.1%

days_in_waiting_list
Real number (ℝ≥0)

ZEROS

Distinct: 128
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.321149175
Minimum: 0
Maximum: 391
Zeros: 115692
Zeros (%): 96.9%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 391
Range: 391
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 17.59472088
Coefficient of variation (CV): 7.580176694
Kurtosis: 186.7930696
Mean: 2.321149175
Median Absolute Deviation (MAD): 0
Skewness: 11.94435345
Sum: 277122
Variance: 309.5742028
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count   Frequency (%)
0                   115692  96.9%
39                  227     0.2%
58                  164     0.1%
44                  141     0.1%
31                  127     0.1%
35                  96      0.1%
46                  94      0.1%
69                  89      0.1%
63                  83      0.1%
50                  80      0.1%
Other values (118)  2597    2.2%

Smallest values
Value  Count   Frequency (%)
0      115692  96.9%
1      12      < 0.1%
2      5       < 0.1%
3      59      < 0.1%
4      25      < 0.1%
5      8       < 0.1%
6      16      < 0.1%
7      4       < 0.1%
8      7       < 0.1%
9      16      < 0.1%

Largest values
Value  Count  Frequency (%)
391    45     < 0.1%
379    15     < 0.1%
330    15     < 0.1%
259    10     < 0.1%
236    35     < 0.1%
224    10     < 0.1%
223    61     0.1%
215    21     < 0.1%
207    15     < 0.1%
193    1      < 0.1%

customer_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Transient: 89613
Transient-Party: 25124
Contract: 4076
Group: 577

Length

Max length: 15
Median length: 9
Mean length: 10.20914649
Min length: 5

Characters and Unicode

Total characters: 1218870
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Transient
2nd row: Transient
3rd row: Transient
4th row: Transient
5th row: Transient

Common Values

Value            Count  Frequency (%)
Transient        89613  75.1%
Transient-Party  25124  21.0%
Contract         4076   3.4%
Group            577    0.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value            Count  Frequency (%)
transient        89613  75.1%
transient-party  25124  21.0%
contract         4076   3.4%
group            577    0.5%

Most occurring characters

ValueCountFrequency (%)
n233550
19.2%
t148013
12.1%
r144514
11.9%
a143937
11.8%
T114737
9.4%
s114737
9.4%
i114737
9.4%
e114737
9.4%
y25124
 
2.1%
-25124
 
2.1%
Other values (7)39660
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1049232
86.1%
Uppercase Letter144514
 
11.9%
Dash Punctuation25124
 
2.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n233550
22.3%
t148013
14.1%
r144514
13.8%
a143937
13.7%
s114737
10.9%
i114737
10.9%
e114737
10.9%
y25124
 
2.4%
o4653
 
0.4%
c4076
 
0.4%
Other values (2)1154
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
T114737
79.4%
P25124
 
17.4%
C4076
 
2.8%
G577
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
-25124
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1193746
97.9%
Common25124
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n233550
19.6%
t148013
12.4%
r144514
12.1%
a143937
12.1%
T114737
9.6%
s114737
9.6%
i114737
9.6%
e114737
9.6%
y25124
 
2.1%
P25124
 
2.1%
Other values (6)14536
 
1.2%
Common
ValueCountFrequency (%)
-25124
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1218870
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n233550
19.2%
t148013
12.1%
r144514
11.9%
a143937
11.8%
T114737
9.4%
s114737
9.4%
i114737
9.4%
e114737
9.4%
y25124
 
2.1%
-25124
 
2.1%
Other values (7)39660
 
3.3%

adr
Real number (ℝ)

ZEROS

Distinct: 8879
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 101.8311215
Minimum: -6.38
Maximum: 5400
Zeros: 1959
Zeros (%): 1.6%
Negative: 1
Negative (%): < 0.1%
Memory size: 932.9 KiB

Quantile statistics

Minimum: -6.38
5-th percentile: 38.4
Q1: 69.29
median: 94.575
Q3: 126
95-th percentile: 193.5
Maximum: 5400
Range: 5406.38
Interquartile range (IQR): 56.71
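
The range and IQR follow directly from the quantiles reported for adr, and they give a quick way to see how extreme the maximum is. The sketch below recomputes both and then applies Tukey's 1.5 x IQR fences; the fence rule is a common convention brought in here for illustration and is not something the report itself applies.

```python
# Quantile statistics reported for adr in this section.
minimum, q1, q3, maximum = -6.38, 69.29, 126.0, 5400.0

iqr = q3 - q1
value_range = maximum - minimum

# Tukey's 1.5 * IQR rule for flagging outliers. This is an assumption
# made for illustration; the report does not use it.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR={iqr:.2f}, range={value_range:.2f}")
print(f"fences=({lower_fence:.3f}, {upper_fence:.3f})")
print("maximum beyond upper fence:", maximum > upper_fence)
```

Under this rule the upper fence is about 211, so the single booking at 5400 lies far outside it, while the lone negative value of -6.38 stays inside the lower fence of about -15.8.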

Descriptive statistics

Standard deviation: 50.53579029
Coefficient of variation (CV): 0.4962705853
Kurtosis: 1013.189851
Mean: 101.8311215
Median Absolute Deviation (MAD): 27.825
Skewness: 10.53021398
Sum: 12157617.6
Variance: 2553.8661
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values
Value                Count  Frequency (%)
62                   3754   3.1%
75                   2715   2.3%
90                   2473   2.1%
65                   2418   2.0%
0                    1959   1.6%
80                   1889   1.6%
95                   1661   1.4%
120                  1607   1.3%
100                  1573   1.3%
85                   1538   1.3%
Other values (8869)  97803  81.9%

Smallest values
Value  Count  Frequency (%)
-6.38  1      < 0.1%
0      1959   1.6%
0.26   1      < 0.1%
0.5    1      < 0.1%
1      15     < 0.1%
1.29   1      < 0.1%
1.48   1      < 0.1%
1.56   2      < 0.1%
1.6    1      < 0.1%
1.8    1      < 0.1%

Largest values
Value   Count  Frequency (%)
5400    1      < 0.1%
510     1      < 0.1%
508     1      < 0.1%
451.5   1      < 0.1%
450     1      < 0.1%
437     1      < 0.1%
426.25  1      < 0.1%
402     1      < 0.1%
397.38  1      < 0.1%
392     2      < 0.1%

required_car_parking_spaces
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

0: 111974
1: 7383
2: 28
3: 3
8: 2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 119390
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      111974  93.8%
1      7383    6.2%
2      28      < 0.1%
3      3       < 0.1%
8      2       < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count   Frequency (%)
0      111974  93.8%
1      7383    6.2%
2      28      < 0.1%
3      3       < 0.1%
8      2       < 0.1%

Most occurring characters

ValueCountFrequency (%)
0111974
93.8%
17383
 
6.2%
228
 
< 0.1%
33
 
< 0.1%
82
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number119390
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0111974
93.8%
17383
 
6.2%
228
 
< 0.1%
33
 
< 0.1%
82
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common119390
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0111974
93.8%
17383
 
6.2%
228
 
< 0.1%
33
 
< 0.1%
82
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII119390
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0111974
93.8%
17383
 
6.2%
228
 
< 0.1%
33
 
< 0.1%
82
 
< 0.1%

total_of_special_requests
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5713627607
Minimum: 0
Maximum: 5
Zeros: 70318
Zeros (%): 58.9%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7927984228
Coefficient of variation (CV): 1.387557043
Kurtosis: 1.492564811
Mean: 0.5713627607
Median Absolute Deviation (MAD): 0
Skewness: 1.349189377
Sum: 68215
Variance: 0.6285293392
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Most frequent values
Value  Count  Frequency (%)
0      70318  58.9%
1      33226  27.8%
2      12969  10.9%
3      2497   2.1%
4      340    0.3%
5      40     < 0.1%

Smallest values
Value  Count  Frequency (%)
0      70318  58.9%
1      33226  27.8%
2      12969  10.9%
3      2497   2.1%
4      340    0.3%
5      40     < 0.1%

Largest values
Value  Count  Frequency (%)
5      40     < 0.1%
4      340    0.3%
3      2497   2.1%
2      12969  10.9%
1      33226  27.8%
0      70318  58.9%

reservation_status
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Check-Out: 75166
Canceled: 43017
No-Show: 1207

Length

Max length: 9
Median length: 9
Mean length: 8.619473993
Min length: 7

Characters and Unicode

Total characters: 1029079
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Canceled
2nd row: Canceled
3rd row: Canceled
4th row: Canceled
5th row: Canceled

Common Values

Value      Count  Frequency (%)
Check-Out  75166  63.0%
Canceled   43017  36.0%
No-Show    1207   1.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value      Count  Frequency (%)
check-out  75166  63.0%
canceled   43017  36.0%
no-show    1207   1.0%

Most occurring characters

ValueCountFrequency (%)
e161200
15.7%
C118183
11.5%
c118183
11.5%
h76373
7.4%
-76373
7.4%
u75166
7.3%
t75166
7.3%
O75166
7.3%
k75166
7.3%
a43017
 
4.2%
Other values (7)135086
13.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter756943
73.6%
Uppercase Letter195763
 
19.0%
Dash Punctuation76373
 
7.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e161200
21.3%
c118183
15.6%
h76373
10.1%
u75166
9.9%
t75166
9.9%
k75166
9.9%
a43017
 
5.7%
n43017
 
5.7%
l43017
 
5.7%
d43017
 
5.7%
Other values (2)3621
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
C118183
60.4%
O75166
38.4%
N1207
 
0.6%
S1207
 
0.6%
Dash Punctuation
ValueCountFrequency (%)
-76373
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin952706
92.6%
Common76373
 
7.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e161200
16.9%
C118183
12.4%
c118183
12.4%
h76373
8.0%
u75166
7.9%
t75166
7.9%
O75166
7.9%
k75166
7.9%
a43017
 
4.5%
n43017
 
4.5%
Other values (6)92069
9.7%
Common
ValueCountFrequency (%)
-76373
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1029079
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e161200
15.7%
C118183
11.5%
c118183
11.5%
h76373
7.4%
-76373
7.4%
u75166
7.3%
t75166
7.3%
O75166
7.3%
k75166
7.3%
a43017
 
4.2%
Other values (7)135086
13.1%

reservation_status_date
Categorical

HIGH CARDINALITY

Distinct: 926
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

21/10/2015: 1461
6/7/2015: 805
25/11/2016: 790
1/1/2015: 763
18/01/2016: 625
Other values (921): 114946

Length

Max length: 10
Median length: 10
Mean length: 9.387536645
Min length: 8

Characters and Unicode

Total characters: 1120778
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): < 0.1%

Sample

1st row: 21/10/2015
2nd row: 21/10/2015
3rd row: 21/10/2015
4th row: 21/10/2015
5th row: 21/10/2015

Common Values

Value               Count   Frequency (%)
21/10/2015          1461    1.2%
6/7/2015            805     0.7%
25/11/2016          790     0.7%
1/1/2015            763     0.6%
18/01/2016          625     0.5%
2/7/2015            469     0.4%
7/12/2016           450     0.4%
18/12/2015          423     0.4%
9/2/2016            412     0.3%
4/4/2016            382     0.3%
Other values (916)  112810  94.5%
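
reservation_status_date is profiled as Categorical because its values are date strings, and the inconsistent zero padding (6/7/2015 next to 18/01/2016) accounts for the lengths ranging from 8 to 10. The sketch below shows one way such strings could be converted to real dates with the standard library; the helper name `parse_status_date` is hypothetical, and the day-first order is inferred from values such as 21/10/2015.

```python
from datetime import date, datetime

# Sample strings taken from the Common Values table for
# reservation_status_date. Values such as 21/10/2015 and 25/11/2016
# show that the first component is the day.
samples = ["21/10/2015", "6/7/2015", "25/11/2016", "1/1/2015", "18/01/2016"]


def parse_status_date(text: str) -> date:
    """Parse a day/month/year string, with or without zero padding."""
    return datetime.strptime(text, "%d/%m/%Y").date()


parsed = [parse_status_date(s) for s in samples]
for raw, d in zip(samples, parsed):
    print(f"{raw:>10} -> {d.isoformat()}")
```

Once parsed, the column can be sorted and grouped chronologically, which the string form does not allow.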

Length

Histogram of lengths of the category

Value               Count   Frequency (%)
21/10/2015          1461    1.2%
6/7/2015            805     0.7%
25/11/2016          790     0.7%
1/1/2015            763     0.6%
18/01/2016          625     0.5%
2/7/2015            469     0.4%
7/12/2016           450     0.4%
18/12/2015          423     0.4%
9/2/2016            412     0.3%
4/4/2016            382     0.3%
Other values (916)  112810  94.5%

Most occurring characters

ValueCountFrequency (%)
/238780
21.3%
1218578
19.5%
0196933
17.6%
2187927
16.8%
679165
 
7.1%
760096
 
5.4%
546838
 
4.2%
326867
 
2.4%
823119
 
2.1%
921359
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number881998
78.7%
Other Punctuation238780
 
21.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1218578
24.8%
0196933
22.3%
2187927
21.3%
679165
 
9.0%
760096
 
6.8%
546838
 
5.3%
326867
 
3.0%
823119
 
2.6%
921359
 
2.4%
421116
 
2.4%
Other Punctuation
ValueCountFrequency (%)
/238780
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1120778
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
/238780
21.3%
1218578
19.5%
0196933
17.6%
2187927
16.8%
679165
 
7.1%
760096
 
5.4%
546838
 
4.2%
326867
 
2.4%
823119
 
2.1%
921359
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII1120778
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/238780
21.3%
1218578
19.5%
0196933
17.6%
2187927
16.8%
679165
 
7.1%
760096
 
5.4%
546838
 
4.2%
326867
 
2.4%
823119
 
2.1%
921359
 
1.9%

Interactions

[Figures omitted: pairwise interaction plots of the numeric variables]

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
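
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the report's own implementation: the function names and the toy data are assumptions, and ties are given average ranks.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical toy data: y = x**2 is monotonic but not linear in x.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)  # approximately 1: the relationship is perfectly monotonic
r = pearson_r(x, y)       # below 1: the relationship is not perfectly linear
```

On this toy input ρ reaches (up to rounding) its maximum while r does not, which is the sense in which ρ is better at capturing nonlinear monotonic relationships.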

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r (only the sign of the slope does).

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
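
The invariance property is easy to check numerically. The sketch below is an illustration under assumptions: `pearson_r` is a hypothetical helper name and the values are made up, not taken from the dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Hypothetical toy values.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 3.5, 8.0]

r_original = pearson_r(x, y)

# Shift and rescale each variable separately, using positive scale factors.
x_rescaled = [10.0 * a + 3.0 for a in x]
y_rescaled = [0.5 * b - 1.0 for b in y]
r_rescaled = pearson_r(x_rescaled, y_rescaled)  # same as r_original up to rounding

# A negative scale factor flips the sign of r but keeps its magnitude.
r_flipped = pearson_r([-a for a in x], y)
```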

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
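
The pair-counting definition above translates directly into code. This sketch implements exactly that formula (often called τ-a), in which tied pairs count as neither concordant nor discordant. Whether the report applies a tie adjustment is not stated here, so treat this as an illustration only. The function name and toy data are assumptions, and the loop is quadratic in the number of observations.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables order this pair the same way
        elif direction < 0:
            discordant += 1   # the variables order this pair in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical toy data: 6 pairs in total, 5 concordant and 1 discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau(x, y)  # (5 - 1) / 6
```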

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
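
A minimal sketch of the bias-corrected estimator, using the correction terms commonly attributed to Bergsma (2013). It assumes a contingency table with more than one observation and no empty rows or columns, and it computes the chi-squared statistic without a continuity correction. The function name and the example tables are illustrative, and the report's own computation may differ in such details.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Correct phi squared and the table dimensions for small-sample bias.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Hypothetical 2 x 2 tables of counts.
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # approximately 1
v_independent = cramers_v_corrected([[25, 25], [25, 25]])  # 0
```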

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
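
Nullity correlation can be read as the Pearson correlation between the missing-value indicators of two columns. The sketch below illustrates that idea under assumptions: the function name is hypothetical, `None` stands in for a missing cell, and the values are made up rather than taken from the dataset.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n
    var_a = sum((p - mean_a) ** 2 for p in a) / n
    var_b = sum((q - mean_b) ** 2 for q in b) / n
    return cov / sqrt(var_a * var_b)


# Made-up values (not taken from the dataset); None marks a missing cell.
agent = [1.0, None, 9.0, None, 14.0, 9.0]
company = [None, None, 40.0, None, None, 223.0]
score = nullity_correlation(agent, company)  # positive: the gaps tend to coincide
```

The score is undefined (division by zero) for a column that is entirely present or entirely missing, so such columns would need to be excluded before calling this helper.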

Sample

First rows

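(index) | Unnamed: 0 | hotel | is_canceled | lead_time | arrival_date_year | arrival_date_month | arrival_date_week_number | arrival_date_day_of_month | stays_in_weekend_nights | stays_in_week_nights | adults | children | babies | meal | country | market_segment | distribution_channel | is_repeated_guest | previous_cancellations | previous_bookings_not_canceled | reserved_room_type | assigned_room_type | booking_changes | deposit_type | agent | company | days_in_waiting_list | customer_type | adr | required_car_parking_spaces | total_of_special_requests | reservation_status | reservation_status_date
0 | 0 | City Hotel | 1 | 615 | 2017 | March | 11 | 16 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 1.0 | NaN | 0 | Transient | 62.0 | 0 | 0 | Canceled | 21/10/2015
1 | 1 | City Hotel | 1 | 615 | 2017 | March | 11 | 16 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 1.0 | NaN | 0 | Transient | 62.0 | 0 | 0 | Canceled | 21/10/2015
2 | 2 | City Hotel | 1 | 615 | 2017 | March | 11 | 16 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 1.0 | NaN | 0 | Transient | 62.0 | 0 | 0 | Canceled | 21/10/2015
3 | 3 | City Hotel | 1 | 615 | 2017 | March | 11 | 16 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 1.0 | NaN | 0 | Transient | 62.0 | 0 | 0 | Canceled | 21/10/2015
4 | 4 | City Hotel | 1 | 615 | 2017 | March | 11 | 16 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 1.0 | NaN | 0 | Transient | 62.0 | 0 | 0 | Canceled | 21/10/2015
5 | 5 | City Hotel | 1 | 615 | 2017 | March | 11 | 16 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 1.0 | NaN | 0 | Transient | 62.0 | 0 | 0 | Canceled | 21/10/2015
6 | 6 | City Hotel | 1 | 615 | 2017 | March | 11 | 16 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 1.0 | NaN | 0 | Transient | 62.0 | 0 | 0 | Canceled | 21/10/2015
7 | 7 | City Hotel | 1 | 615 | 2017 | March | 11 | 16 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 1.0 | NaN | 0 | Transient | 62.0 | 0 | 0 | Canceled | 21/10/2015
8 | 8 | City Hotel | 1 | 615 | 2017 | March | 11 | 16 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 1.0 | NaN | 0 | Transient | 62.0 | 0 | 0 | Canceled | 21/10/2015
9 | 9 | City Hotel | 1 | 615 | 2017 | March | 11 | 16 | 0 | 2 | 2 | 0.0 | 0 | BB | PRT | Groups | TA/TO | 0 | 0 | 0 | A | A | 0 | Non Refund | 1.0 | NaN | 0 | Transient | 62.0 | 0 | 0 | Canceled | 21/10/2015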

Last rows

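(index) | Unnamed: 0 | hotel | is_canceled | lead_time | arrival_date_year | arrival_date_month | arrival_date_week_number | arrival_date_day_of_month | stays_in_weekend_nights | stays_in_week_nights | adults | children | babies | meal | country | market_segment | distribution_channel | is_repeated_guest | previous_cancellations | previous_bookings_not_canceled | reserved_room_type | assigned_room_type | booking_changes | deposit_type | agent | company | days_in_waiting_list | customer_type | adr | required_car_parking_spaces | total_of_special_requests | reservation_status | reservation_status_date
119380 | 119380 | City Hotel | 1 | 65 | 2016 | July | 29 | 14 | 0 | 3 | 2 | 0.0 | 0 | BB | PRT | Direct | Direct | 0 | 0 | 0 | E | E | 0 | No Deposit | 14.0 | NaN | 0 | Transient | 139.50 | 0 | 0 | Canceled | 10/5/2016
119381 | 119381 | City Hotel | 1 | 93 | 2016 | July | 29 | 14 | 1 | 3 | 2 | 1.0 | 0 | BB | CHN | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 109.35 | 0 | 0 | Canceled | 19/04/2016
119382 | 119382 | City Hotel | 1 | 91 | 2016 | July | 29 | 14 | 1 | 3 | 2 | 0.0 | 0 | BB | FRA | Online TA | TA/TO | 0 | 0 | 0 | D | D | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 118.80 | 0 | 0 | Canceled | 22/04/2016
119383 | 119383 | City Hotel | 1 | 132 | 2016 | July | 29 | 14 | 2 | 4 | 2 | 1.0 | 0 | BB | FRA | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 114.75 | 0 | 0 | Canceled | 23/04/2016
119384 | 119384 | City Hotel | 1 | 162 | 2016 | July | 29 | 14 | 2 | 5 | 2 | 0.0 | 0 | SC | GBR | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 86.82 | 0 | 0 | Canceled | 13/05/2016
119385 | 119385 | City Hotel | 1 | 150 | 2016 | July | 29 | 14 | 2 | 8 | 3 | 0.0 | 0 | BB | FRA | Online TA | TA/TO | 0 | 0 | 0 | D | D | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 131.75 | 0 | 2 | Canceled | 31/03/2016
119386 | 119386 | City Hotel | 1 | 150 | 2016 | July | 29 | 14 | 2 | 8 | 2 | 0.0 | 0 | BB | FRA | Online TA | TA/TO | 0 | 0 | 0 | A | A | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 101.15 | 0 | 1 | Canceled | 31/03/2016
119387 | 119387 | City Hotel | 1 | 60 | 2016 | July | 29 | 15 | 0 | 1 | 3 | 0.0 | 0 | BB | PRT | Online TA | TA/TO | 0 | 0 | 0 | D | D | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 137.70 | 0 | 1 | Canceled | 7/7/2016
119388 | 119388 | City Hotel | 1 | 9 | 2016 | July | 29 | 15 | 0 | 1 | 3 | 0.0 | 0 | BB | FRA | Online TA | TA/TO | 0 | 0 | 0 | D | D | 0 | No Deposit | 9.0 | NaN | 0 | Transient | 197.00 | 0 | 1 | Canceled | 13/07/2016
119389 | 119389 | City Hotel | 0 | 194 | 2016 | July | 29 | 15 | 0 | 1 | 2 | 0.0 | 0 | BB | CHE | Online TA | TA/TO | 0 | 0 | 0 | B | B | 1 | No Deposit | 9.0 | NaN | 0 | Transient-Party | 86.75 | 0 | 0 | Check-Out | 16/07/2016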